Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wall-time/all tasks profiler #55889

Merged
merged 1 commit into from
Oct 25, 2024
Merged

Wall-time/all tasks profiler #55889

merged 1 commit into from
Oct 25, 2024

Conversation

d-netto
Copy link
Member

@d-netto d-netto commented Sep 26, 2024

One limitation of sampling CPU/thread profiles, as is currently done in Julia, is that they primarily capture samples from CPU-intensive tasks.

If many tasks are performing IO or contending for concurrency primitives like semaphores, these tasks won’t appear in the profile, as they aren't scheduled on OS threads sampled by the profiler.

A wall-time profiler, like the one implemented in this PR, samples tasks regardless of OS thread scheduling. This enables profiling of IO-heavy tasks and detecting areas of heavy contention in the system.

Co-developed with @nickrobinson251.

@d-netto
Copy link
Member Author

d-netto commented Sep 26, 2024

An additional note: this kind of wall-time/all tasks profiler is also implemented in Go (and denoted as goroutine profiler there), so there is some precedent for this in other languages as well: https://github.com/felixge/fgprof.

@d-netto
Copy link
Member Author

d-netto commented Sep 26, 2024

@nickrobinson251 I can't assign you as reviewer... Feel free to assign yourself or post review comments otherwise.

@d-netto d-netto force-pushed the dcn-all-task-profiler branch 2 times, most recently from 6b80fe3 to f5c8f5f Compare September 26, 2024 14:13
@IanButterworth
Copy link
Member

I think this is related to #55103. Could the metrics here be useful in that too?

@d-netto d-netto force-pushed the dcn-all-task-profiler branch 2 times, most recently from 1e9f41f to 6cd27d7 Compare September 26, 2024 16:02
@nsajko nsajko added the feature Indicates new feature / enhancement requests label Sep 26, 2024
@d-netto d-netto force-pushed the dcn-all-task-profiler branch 3 times, most recently from f6ea007 to 1029a84 Compare September 27, 2024 11:38
@d-netto d-netto force-pushed the dcn-all-task-profiler branch 3 times, most recently from 0d4ca9c to e493403 Compare September 30, 2024 11:42
@d-netto
Copy link
Member Author

d-netto commented Sep 30, 2024

I think this is related to #55103. Could the metrics here be useful in that too?

For diagnosing excessive scheduling time? I can't immediately see how this PR would be useful for that.

@d-netto
Copy link
Member Author

d-netto commented Sep 30, 2024

For diagnosing excessive scheduling time

#55103 seems like a much more direct approach for doing so, at least.

@d-netto d-netto force-pushed the dcn-all-task-profiler branch 2 times, most recently from 90bca24 to 14766d3 Compare September 30, 2024 14:57
src/signals-unix.c Outdated Show resolved Hide resolved
@NHDaly NHDaly added the needs tests Unit tests are required for this change label Oct 1, 2024
src/task.c Outdated Show resolved Hide resolved
@d-netto d-netto force-pushed the dcn-all-task-profiler branch 2 times, most recently from 5ddd5ba to c9d1995 Compare October 3, 2024 12:50
@d-netto d-netto force-pushed the dcn-all-task-profiler branch 8 times, most recently from 6c033d6 to 7e9966e Compare October 25, 2024 14:05
@d-netto
Copy link
Member Author

d-netto commented Oct 25, 2024

Wrote this short demo on how to use the profiler: https://github.com/d-netto/Wall-time-Profiler-Demo.

Feedback is welcome.

Perhaps we should add this to some documentation page in Julia itself.

src/signal-handling.c Outdated Show resolved Hide resolved
src/stackwalk.c Outdated Show resolved Hide resolved
src/stackwalk.c Outdated Show resolved Hide resolved
@vtjnash
Copy link
Member

vtjnash commented Oct 25, 2024

Can you add this to the Profile docs and the NEWS file?

@d-netto
Copy link
Member Author

d-netto commented Oct 25, 2024

Can you add this to the Profile docs and the NEWS file?

Sounds good, will do. Thanks for the reviews.

@d-netto d-netto force-pushed the dcn-all-task-profiler branch 5 times, most recently from 7ece17a to 50a3ce5 Compare October 25, 2024 20:14
@d-netto d-netto merged commit f6a38e0 into master Oct 25, 2024
7 checks passed
@d-netto d-netto deleted the dcn-all-task-profiler branch October 25, 2024 22:51
@JZL
Copy link

JZL commented Oct 27, 2024

This looks really interesting, I'm excited to use it!

To confirm, there's no way to use this with SIGUSR1/SIGINFO signalling, right? I've been trying to make sense of the code but am unsure -- am I correct that it's this line's [1] hardcoded 0/false which forces it to always be the old version?

Are there plans to make this modifiable using e.g. env variables like for heap snapshots or function references like peek_report.

[1]

if (jl_profile_start_timer(0) < 0)

@nickrobinson251
Copy link
Contributor

there's no way to use this with SIGUSR1/SIGINFO signalling, right?

Your understanding is correct.

Are there plans to make this modifiable using e.g. env variables like for heap snapshots or function references like peek_report.

i think that could be a good addition! Perhaps open a feature request issue for it? We might even want both e.g. I suppose an ENV variable could control whether to take a cpu profile or a wall-time profile, but if you want both it might be best to use peek_report, so we might want to add an easy, official way to do that, and document it. Right now i think you'd have to use the non-public function Profile.start_timer(true). Worth an issue to discuss, i think :)

Also, you might also be interested in #56043 (which depending on your use-case might be even better than a wall-time profile, since it'll print the backtrace for all tasks not just a sample)

@d-netto d-netto mentioned this pull request Oct 27, 2024
3 tasks
d-netto added a commit to RelationalAI/julia that referenced this pull request Oct 27, 2024
One limitation of sampling CPU/thread profiles, as is currently done in
Julia, is that they primarily capture samples from CPU-intensive tasks.

If many tasks are performing IO or contending for concurrency primitives
like semaphores, these tasks won’t appear in the profile, as they aren't
scheduled on OS threads sampled by the profiler.

A wall-time profiler, like the one implemented in this PR, samples tasks
regardless of OS thread scheduling. This enables profiling of IO-heavy
tasks and detecting areas of heavy contention in the system.

Co-developed with @nickrobinson251.
d-netto added a commit to RelationalAI/julia that referenced this pull request Oct 27, 2024
One limitation of sampling CPU/thread profiles, as is currently done in
Julia, is that they primarily capture samples from CPU-intensive tasks.

If many tasks are performing IO or contending for concurrency primitives
like semaphores, these tasks won’t appear in the profile, as they aren't
scheduled on OS threads sampled by the profiler.

A wall-time profiler, like the one implemented in this PR, samples tasks
regardless of OS thread scheduling. This enables profiling of IO-heavy
tasks and detecting areas of heavy contention in the system.

Co-developed with @nickrobinson251.
d-netto added a commit to RelationalAI/julia that referenced this pull request Oct 29, 2024
One limitation of sampling CPU/thread profiles, as is currently done in
Julia, is that they primarily capture samples from CPU-intensive tasks.

If many tasks are performing IO or contending for concurrency primitives
like semaphores, these tasks won’t appear in the profile, as they aren't
scheduled on OS threads sampled by the profiler.

A wall-time profiler, like the one implemented in this PR, samples tasks
regardless of OS thread scheduling. This enables profiling of IO-heavy
tasks and detecting areas of heavy contention in the system.

Co-developed with @nickrobinson251.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Indicates new feature / enhancement requests profiler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants